Abstract. We show that the Arzelà-Ascoli theorem and the Kolmogorov compactness theorem are both consequences of a simple lemma on compactness in metric spaces. Their relation to Helly's theorem is discussed. The paper contains a detailed discussion of the historical background of the Kolmogorov compactness theorem.

1. Introduction

Compactness results in the spaces $L^p(\mathbb{R}^d)$ ($1 \le p < \infty$) are often vital in existence proofs for nonlinear partial differential equations. A necessary and sufficient condition for a subset of $L^p(\mathbb{R}^d)$ to be compact is given in what is often called the Kolmogorov compactness theorem, or Fréchet-Kolmogorov compactness theorem. Proofs of this theorem are frequently based on the Arzelà-Ascoli theorem. We show here how one can deduce both the Kolmogorov compactness theorem and the Arzelà-Ascoli theorem from one common lemma on compactness in metric spaces, which in turn is based on the fact that a metric space is compact if and only if it is complete and totally bounded.

Furthermore, we trace the historical roots of Kolmogorov's compactness theorem, which originated in Kolmogorov's classical paper [18] from 1931. However, there were several other approaches to the issue of describing compact subsets of $L^p(\mathbb{R}^d)$ prior to and after Kolmogorov, and several of these are described in Section 4. Furthermore, extensions to other spaces, say $L^p(\mathbb{R}^d)$ ($0 < p < 1$), Orlicz spaces, or compact groups, are described. Helly's theorem is often used as a replacement for Kolmogorov's compactness theorem, in particular in the context of nonlinear hyperbolic conservation laws, in spite of being more specialized (e.g., in the sense that its classical version requires one spatial dimension). For instance, Helly's theorem is an essential ingredient in Glimm's groundbreaking existence proof for nonlinear hyperbolic systems [2]. We show below that Helly's theorem is an easy consequence of Kolmogorov's compactness theorem.

2. Preliminary results 

An $\varepsilon$-cover of a metric space is a cover of the space consisting of sets of diameter at most $\varepsilon$. A metric space is called totally bounded if it admits a finite $\varepsilon$-cover for every $\varepsilon > 0$. It is well known that a metric space is compact if and only if it is complete and totally bounded (see, e.g., [MJ p. 13]). Since we are interested in compactness results for subsets of Banach spaces, we may, and shall, concentrate our attention on total boundedness.

Here is the key lemma for many compactness results (in this lemma and its proof, every metric is named $d$):

Lemma 1. Let $X$ be a metric space. Assume that, for every $\varepsilon > 0$, there exists some $\delta > 0$, a metric space $W$, and a mapping $\Phi\colon X \to W$ so that $\Phi[X]$ is totally bounded, and whenever $x, y \in X$ are such that $d(\Phi(x), \Phi(y)) < \delta$, then $d(x, y) < \varepsilon$. Then $X$ is totally bounded.

Proof. For any $\varepsilon > 0$, pick $\delta$, $W$ and $\Phi$ as in the statement of the lemma. Since $\Phi[X]$ is totally bounded, there exists a finite $\delta$-cover $\{V_1, \ldots, V_n\}$ of $\Phi[X]$. Then it immediately follows from the assumptions that $\{\Phi^{-1}(V_1), \ldots, \Phi^{-1}(V_n)\}$ is an $\varepsilon$-cover of $X$. Thus $X$ is totally bounded. □
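
The cover step in this proof can be spelled out as follows. This is a sketch of one way to fill in the detail, not the authors' wording. To keep every inequality strict, it assumes the cover of the image is chosen with diameters at most $\delta/2$, which total boundedness permits.

```latex
% Assumption (ours): V_1, ..., V_n is a finite (\delta/2)-cover of \Phi[X].
\begin{align*}
x, y \in \Phi^{-1}(V_i)
  &\implies \Phi(x), \Phi(y) \in V_i \\
  &\implies d(\Phi(x), \Phi(y)) \le \operatorname{diam} V_i \le \tfrac{\delta}{2} < \delta \\
  &\implies d(x, y) < \varepsilon .
\end{align*}
% Hence \operatorname{diam} \Phi^{-1}(V_i) \le \varepsilon for each i, and
% X = \bigcup_{i=1}^{n} \Phi^{-1}(V_i) because the sets V_i cover \Phi[X].
```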

Lemma 1 embodies the main argument in the standard proof of the classical Arzelà-Ascoli theorem, as we now demonstrate.

Theorem 2 (Arzelà-Ascoli). Let $\Omega$ be a compact topological space. Then a subset of $C(\Omega)$ is totally bounded in the supremum norm if, and only if,

(i) it is pointwise bounded, and

(ii) it is equicontinuous.

Recall the definition of equicontinuity: Condition (ii) means that for every $x \in \Omega$ and every $\varepsilon > 0$ there is a neighborhood $V$ of $x$ so that $|f(y) - f(x)| < \varepsilon$ for all $y \in V$ and all $f$ in the given set of functions.

Proof. Assume $\mathcal{F} \subset C(\Omega)$ is pointwise bounded and equicontinuous. Let $\varepsilon > 0$. Combining the equicontinuity of $\mathcal{F}$ and compactness of $\Omega$, we can find a finite set of points $x_1, \ldots, x_n \in \Omega$ with neighborhoods $V_1, \ldots, V_n$ covering all of $\Omega$ so that $|f(x) - f(x_j)| < \varepsilon$ whenever $f \in \mathcal{F}$ and $x \in V_j$.

Define $\Phi\colon \mathcal{F} \to \mathbb{R}^n$ by

$$\Phi(f) = (f(x_1), \ldots, f(x_n)).$$

By the pointwise boundedness of $\mathcal{F}$, the image $\Phi[\mathcal{F}]$ is bounded, and hence totally bounded, in $\mathbb{R}^n$.

Furthermore, if $f, g \in \mathcal{F}$ with $\|\Phi(f) - \Phi(g)\|_\infty < \varepsilon$, then since any $x \in \Omega$ belongs to some $V_j$,

$$|f(x) - g(x)| \le |f(x) - f(x_j)| + |f(x_j) - g(x_j)| + |g(x_j) - g(x)| < 3\varepsilon,$$

and so $\|f - g\|_\infty < 3\varepsilon$. By Lemma 1, $\mathcal{F}$ is totally bounded.

For the converse, assume that $\mathcal{F}$ is a totally bounded subset of $C(\Omega)$.

The existence of a finite $\varepsilon$-cover for $\mathcal{F}$, for any $\varepsilon$, clearly implies the boundedness of $\mathcal{F}$, thus establishing the uniform boundedness and hence also pointwise boundedness of $\mathcal{F}$.

To prove equicontinuity, let $x \in \Omega$ and $\varepsilon > 0$ be given. Pick an $\varepsilon$-cover $\{U_1, \ldots, U_m\}$ of $\mathcal{F}$, and choose $g_j \in U_j$ for $j = 1, \ldots, m$. Pick a neighborhood $V_j$ of $x$ so that $|g_j(y) - g_j(x)| < \varepsilon$ whenever $y \in V_j$, for $j = 1, \ldots, m$. Let $V = V_1 \cap \cdots \cap V_m$. If $f \in U_j$ then $\|f - g_j\|_\infty \le \varepsilon$, and so when $y \in V$,

$$|f(y) - f(x)| \le |f(y) - g_j(y)| + |g_j(y) - g_j(x)| + |g_j(x) - f(x)| < 3\varepsilon,$$

which proves equicontinuity. □

Remark 3. This theorem was first proved by Ascoli [3] for equi-Lipschitz functions and extended by Arzelà [2] to a general family of equicontinuous functions. See [H p. 203].

We present the following theorem, first proved by Fréchet [12] for the case $p = 2$, as a warm-up exercise, as the proof is short and nicely exposes some key ideas for the proof of Theorem 5.

Theorem 4. A subset of $\ell^p$, where $1 \le p < \infty$, is totally bounded if, and only if,

(i) it is pointwise bounded, and

(ii) for every $\varepsilon > 0$ there is some $n$ so that, for every $x$ in the given subset,

$$\sum_{k > n} |x_k|^p < \varepsilon^p.$$

Proof. Assume that $\mathcal{F} \subset \ell^p$ satisfies the two conditions. Given $\varepsilon > 0$, pick $n$ as in the second condition, and define a mapping $\Phi\colon \mathcal{F} \to \mathbb{R}^n$ by

$$\Phi(x) = (x_1, \ldots, x_n).$$

By the pointwise boundedness of $\mathcal{F}$, the image $\Phi[\mathcal{F}]$ is totally bounded. If $x, y \in \mathcal{F}$ with $|\Phi(x) - \Phi(y)|_p = \bigl(\sum_{k=1}^{n} |x_k - y_k|^p\bigr)^{1/p} < \varepsilon$, then

$$\|x - y\|_p \le \Bigl(\sum_{k=1}^{n} |x_k - y_k|^p\Bigr)^{1/p} + \Bigl(\sum_{k > n} |x_k - y_k|^p\Bigr)^{1/p} < \varepsilon + 2\varepsilon = 3\varepsilon.$$

By Lemma 1, $\mathcal{F}$ is totally bounded.

We will leave proving the converse as an exercise to the reader. The techniques from the proof of Theorem 5 are easily adapted. See also the proof of Theorem [S]. □

3. The Kolmogorov-Riesz theorem 

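Theorem 5 (Kolmogorov-Riesz). Let $1 \le p < \infty$. A subset $\mathcal{F}$ of $L^p(\mathbb{R}^n)$ is totally bounded if, and only if,

(i) $\mathcal{F}$ is bounded,

(ii) for every $\varepsilon > 0$ there is some $R$ so that, for every $f \in \mathcal{F}$,

$$\int_{|x| > R} |f(x)|^p \, dx < \varepsilon^p,$$

(iii) for every $\varepsilon > 0$ there is some $\rho > 0$ so that, for every $f \in \mathcal{F}$ and $y \in \mathbb{R}^n$ with $|y| < \rho$,

$$\int_{\mathbb{R}^n} |f(x + y) - f(x)|^p \, dx < \varepsilon^p.$$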

Proof. Assume that $\mathcal{F} \subset L^p(\mathbb{R}^n)$ satisfies the three conditions. First, given $\varepsilon > 0$, pick $R$ as in the second condition, and $\rho$ as in the third condition.

Let $Q$ be an open cube centered at the origin so that $|y| < \tfrac{1}{2}\rho$ whenever $y \in Q$. Let $Q_1, \ldots, Q_N$ be mutually non-overlapping translates of $Q$ so that the closure of $\bigcup_i Q_i$ contains the ball with radius $R$ centered at the origin. Let $P$ be the projection map of $L^p(\mathbb{R}^n)$ onto the linear span of the characteristic functions of the cubes $Q_i$ given by

$$Pf(x) = \begin{cases} \dfrac{1}{|Q_i|} \displaystyle\int_{Q_i} f(z) \, dz, & x \in Q_i, \quad i = 1, \ldots, N, \\ 0 & \text{otherwise.} \end{cases}$$

From (ii) and the definition of $Pf$ we find, for $f \in \mathcal{F}$,

$$\|f - Pf\|_p^p < \varepsilon^p + \sum_{i=1}^{N} \int_{Q_i} |f(x) - Pf(x)|^p \, dx = \varepsilon^p + \sum_{i=1}^{N} \int_{Q_i} \Bigl| \frac{1}{|Q_i|} \int_{Q_i} (f(x) - f(z)) \, dz \Bigr|^p dx.$$

Next we use Jensen's inequality and change a variable of integration, where we note that $x - z \in 2Q$ when $x, z \in Q_i$:

$$\begin{aligned}
\|f - Pf\|_p^p &< \varepsilon^p + \sum_{i=1}^{N} \int_{Q_i} \frac{1}{|Q_i|} \int_{Q_i} |f(x) - f(z)|^p \, dz \, dx \\
&\le \varepsilon^p + \sum_{i=1}^{N} \int_{Q_i} \frac{1}{|Q|} \int_{2Q} |f(x) - f(x + y)|^p \, dy \, dx \\
&\le \varepsilon^p + \frac{1}{|Q|} \int_{2Q} \int_{\mathbb{R}^n} |f(x) - f(x + y)|^p \, dx \, dy \\
&< \varepsilon^p + \frac{1}{|Q|} \int_{2Q} \varepsilon^p \, dy = (2^n + 1)\varepsilon^p
\end{aligned}$$

by (iii). Thus $\|f - Pf\|_p < (2^n + 1)^{1/p}\varepsilon$, and $\|f\|_p < (2^n + 1)^{1/p}\varepsilon + \|Pf\|_p$. By the linearity of $P$, if $f, g \in \mathcal{F}$ and $\|Pf - Pg\|_p < \varepsilon$ then $\|f - g\|_p < (2(2^n + 1)^{1/p} + 1)\varepsilon$. Moreover, since $P$ is bounded (in fact $\|P\| = 1$) and $\mathcal{F}$ is bounded by (i), the image $P[\mathcal{F}]$ is bounded. Since the image of $P$ is finite dimensional, $P[\mathcal{F}]$ is totally bounded. Thus $\mathcal{F}$ is totally bounded by Lemma 1.
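
The final estimate on the distance between $f$ and $g$ combines the bound on $\|f - Pf\|_p$ with the triangle inequality. Written out, assuming as in the proof that $f$ and $g$ belong to the given family and that $\|Pf - Pg\|_p < \varepsilon$:

```latex
\begin{align*}
\|f - g\|_p
  &\le \|f - Pf\|_p + \|Pf - Pg\|_p + \|Pg - g\|_p \\
  &< (2^n + 1)^{1/p}\varepsilon + \varepsilon + (2^n + 1)^{1/p}\varepsilon \\
  &= \bigl(2(2^n + 1)^{1/p} + 1\bigr)\varepsilon .
\end{align*}
% The factor 2^n is the volume ratio |2Q| / |Q| of the doubled cube in R^n.
```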

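For the converse, assume that $\mathcal{F}$ is totally bounded.

The existence of a finite $\varepsilon$-cover for $\mathcal{F}$, for any $\varepsilon$, clearly implies the boundedness of $\mathcal{F}$, thus establishing Condition (i).

To establish Condition (ii), let $\varepsilon > 0$ be given, let $\{U_1, \ldots, U_m\}$ be an $\varepsilon$-cover of $\mathcal{F}$, and choose $g_j \in U_j$ for $j = 1, \ldots, m$. Select $R$ so that

$$\int_{|x| > R} |g_j(x)|^p \, dx < \varepsilon^p, \qquad j = 1, \ldots, m.$$

If $f \in U_j$ then $\|f - g_j\|_p \le \varepsilon$, and so

$$\begin{aligned}
\Bigl(\int_{|x| > R} |f(x)|^p \, dx\Bigr)^{1/p} &\le \Bigl(\int_{|x| > R} |f(x) - g_j(x)|^p \, dx\Bigr)^{1/p} + \Bigl(\int_{|x| > R} |g_j(x)|^p \, dx\Bigr)^{1/p} \\
&\le \|f - g_j\|_p + \Bigl(\int_{|x| > R} |g_j(x)|^p \, dx\Bigr)^{1/p} < 2\varepsilon,
\end{aligned}$$

thus establishing Condition (ii).

Condition (iii) is established similarly, by noting that the inequality of the condition is easily established for any single function $f \in L^p(\mathbb{R}^n)$, for example using the fact that $C_c^\infty(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$. Then, picking an $\varepsilon$-cover $\{U_1, \ldots, U_m\}$ and $g_j \in U_j$ for each $j$ as in the previous paragraph, given $\varepsilon > 0$ we can find $\rho > 0$ with

$$\int_{\mathbb{R}^n} |g_j(x + y) - g_j(x)|^p \, dx < \varepsilon^p, \qquad |y| < \rho, \quad j = 1, \ldots, m.$$

Again, if $f \in U_j$ we find

$$\begin{aligned}
\Bigl(\int_{\mathbb{R}^n} |f(x + y) - f(x)|^p \, dx\Bigr)^{1/p} &\le \Bigl(\int_{\mathbb{R}^n} |f(x + y) - g_j(x + y)|^p \, dx\Bigr)^{1/p} \\
&\quad + \Bigl(\int_{\mathbb{R}^n} |g_j(x + y) - g_j(x)|^p \, dx\Bigr)^{1/p} \\
&\quad + \Bigl(\int_{\mathbb{R}^n} |g_j(x) - f(x)|^p \, dx\Bigr)^{1/p} \\
&< 3\varepsilon,
\end{aligned}$$

and the proof is complete. □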
Remark 6. (I) A singleton set is clearly totally bounded, yet Condition (iii) is not obvious for a singleton set at first glance. However, it follows easily from the density of the space of smooth functions with compact support in $L^p$.

(II) In applications, one sometimes constructs a sequence $f_1, f_2, \ldots$ in $L^p$ satisfying the first two conditions of Theorem 5 and the condition

$$\Bigl(\int |f_n(x + y) - f_n(x)|^p \, dx\Bigr)^{1/p} \le \alpha(y) + \beta(n), \qquad \lim_{y \to 0} \alpha(y) = 0, \quad \lim_{n \to \infty} \beta(n) = 0.$$

Then for some $N$ and $\delta > 0$, the right-hand side of the above inequality is less than $\varepsilon$ for all $n > N$ and $|y| < \delta$. By the fact noted in the previous paragraph, we can choose a smaller upper bound for $|y|$ to make the integral smaller than $\varepsilon$ for $n = 1, 2, \ldots, N$. Thus $\{f_1, f_2, \ldots\}$ satisfies Condition (iii), and hence a convergent subsequence exists.

An interesting corollary to the Kolmogorov theorem is the following result, see [22], which also contains a variant using the uniform smoothness of the functions in $\mathcal{F}$ and their Fourier transforms. See also [TJ, which contains an alternate formulation based on the short-time Fourier transform, as well as one based on the wavelet transform.

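Corollary 7. Let $\mathcal{F} \subset L^2(\mathbb{R}^d)$ be such that $\sup_{f \in \mathcal{F}} \|f\|_2 \le M < \infty$. If

$$\lim_{r \to \infty} \sup_{f \in \mathcal{F}} \int_{|x| > r} |f(x)|^2 \, dx = 0 \quad \text{and} \quad \lim_{\rho \to \infty} \sup_{f \in \mathcal{F}} \int_{|\xi| > \rho} |\hat{f}(\xi)|^2 \, d\xi = 0,$$

then $\mathcal{F}$ is totally bounded in $L^2(\mathbb{R}^d)$.

Proof. We show that $\mathcal{F}$ satisfies the conditions of Theorem 5 for $p = 2$. Clearly, Conditions (i) and (ii) are among our assumptions, so we only need to prove (iii). For $f \in \mathcal{F}$ we find:

$$\begin{aligned}
\int_{\mathbb{R}^d} |f(x + y) - f(x)|^2 \, dx &= \int_{\mathbb{R}^d} |(e^{i\xi \cdot y} - 1)\hat{f}(\xi)|^2 \, d\xi \\
&\le \int_{|\xi| \le \rho} |(e^{i\xi \cdot y} - 1)\hat{f}(\xi)|^2 \, d\xi + 4 \int_{|\xi| > \rho} |\hat{f}(\xi)|^2 \, d\xi \\
&\le M^2 \sup_{|\xi| \le \rho} |e^{i\xi \cdot y} - 1|^2 + \varepsilon \qquad \text{for } \rho \text{ big enough} \\
&\le M^2 \rho^2 |y|^2 + \varepsilon \le 2\varepsilon
\end{aligned}$$

if $|y| \le \sqrt{\varepsilon}/(\rho M)$. Here $\rho$, and hence the upper bound on $|y|$, can be chosen independently of $f$. This shows Condition (iii) of Theorem 5 and finishes the proof. □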
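
The first equality in the proof of Corollary 7 is Plancherel's theorem applied to a translate. A sketch of that step follows, assuming the unitary normalization $\hat{f}(\xi) = (2\pi)^{-d/2} \int f(x) e^{-i x \cdot \xi} \, dx$; the normalization is our assumption, since the text does not state one.

```latex
% Translation becomes modulation on the Fourier side:
\[
\widehat{f(\cdot + y)}(\xi) = e^{i \xi \cdot y}\, \hat f(\xi),
\qquad\text{so}\qquad
\int_{\mathbb{R}^d} |f(x + y) - f(x)|^2 \, dx
  = \int_{\mathbb{R}^d} \bigl|(e^{i \xi \cdot y} - 1)\hat f(\xi)\bigr|^2 \, d\xi .
\]
% Elementary bounds used on the two regions of integration:
\[
|e^{i\theta} - 1| = 2\,|\sin(\theta/2)| \le \min(|\theta|,\, 2),
\]
\[
|e^{i \xi \cdot y} - 1|^2 \le \rho^2 |y|^2 \ \text{ for } |\xi| \le \rho,
\qquad
|e^{i \xi \cdot y} - 1|^2 \le 4 \ \text{ for all } \xi .
\]
```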

In the following result, $L^p_{\mathrm{loc}}(\Omega)$ is equipped with the topology of $L^p$ convergence on compact subsets of $\Omega$. Recall that $\Omega$ is the countable union of compacts, e.g., $\Omega = K_1 \cup K_2 \cup \cdots$ with $K_k = \{x \in \Omega : |x| \le k \text{ and } \operatorname{dist}(x, \mathbb{R}^n \setminus \Omega) \ge 1/k\}$. Moreover any compact subset of $\Omega$ is contained in some $K_k$, and so the topology on $L^p_{\mathrm{loc}}(\Omega)$ is given by the countable family of seminorms $\|f\|_k = \|f|_{K_k}\|_{L^p(K_k)}$. $L^p_{\mathrm{loc}}(\Omega)$ is complete with respect to the metric $(f, g) \mapsto \sum_{k=1}^{\infty} \min(2^{-k}, \|f - g\|_k)$.

Corollary 8. Let $\Omega \subset \mathbb{R}^n$ be an open set. Write $f_K(x) = f(x)$ when $x \in K$, $f_K(x) = 0$ otherwise. A subset $\mathcal{F} \subset L^p_{\mathrm{loc}}(\Omega)$ is totally bounded if, and only if, the following holds:

(i) For every compact $K \subset \Omega$ there is some $M$ so that

$$\int |f_K(x)|^p \, dx < M, \qquad f \in \mathcal{F}.$$

(ii) For every $\varepsilon > 0$ and every compact $K \subset \Omega$ there is some $\rho > 0$ so that

$$\int |f_K(x + y) - f_K(x)|^p \, dx < \varepsilon^p, \qquad f \in \mathcal{F}, \quad |y| < \rho.$$

Proof. Note that $\mathcal{F}$ is totally bounded in $L^p_{\mathrm{loc}}(\Omega)$ if and only if $\mathcal{F}_k = \{f_{K_k} : f \in \mathcal{F}\}$ is totally bounded for every $k$, with $K_k$ as defined above. □

For the next result, recall that the Sobolev space $W^{k,p}(\mathbb{R}^n)$ is defined to consist of those measurable functions $f$ which, together with all their distributional derivatives $D^\alpha f$ of order $|\alpha| \le k$, belong to $L^p(\mathbb{R}^n)$. Here $\alpha = (\alpha_1, \ldots, \alpha_n)$ is a multi-index, i.e., each $\alpha_j$ is a nonnegative integer, $|\alpha| = \alpha_1 + \cdots + \alpha_n$, and $D^\alpha = \partial^{|\alpha|}/(\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n})$. Finally, $W^{k,p}(\mathbb{R}^n)$ is equipped with the complete norm

$$\|f\|_{k,p} = \Bigl(\int_{\mathbb{R}^n} \sum_{|\alpha| \le k} |D^\alpha f(x)|^p \, dx\Bigr)^{1/p}.$$

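Corollary 9. A subset $\mathcal{F} \subset W^{k,p}(\mathbb{R}^n)$ is totally bounded if, and only if, the following holds:

(i) $\mathcal{F}$ is bounded, i.e., there is some $M$ so that

$$\int |D^\alpha f(x)|^p \, dx < M, \qquad f \in \mathcal{F}, \quad |\alpha| \le k.$$

(ii) For every $\varepsilon > 0$ there is some $R$ so that

$$\int_{|x| > R} |D^\alpha f(x)|^p \, dx < \varepsilon^p, \qquad f \in \mathcal{F}, \quad |\alpha| \le k.$$

(iii) For every $\varepsilon > 0$ there is some $\rho > 0$ so that

$$\int |D^\alpha f(x + y) - D^\alpha f(x)|^p \, dx < \varepsilon^p, \qquad f \in \mathcal{F}, \quad |\alpha| \le k, \quad |y| < \rho.$$

Proof. Note that $\mathcal{F}$ is totally bounded in $W^{k,p}(\mathbb{R}^n)$ if and only if $D^\alpha[\mathcal{F}] = \{D^\alpha f : f \in \mathcal{F}\}$ is totally bounded in $L^p(\mathbb{R}^n)$ for every multi-index $\alpha$ with $|\alpha| \le k$. □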

4. A bit of history

In 1931, Kolmogorov [18] proved the first result in this direction. It characterizes compactness in $L^p(\mathbb{R}^n)$ for $1 < p < \infty$, in the case where all functions are supported in a common bounded set. Condition (iii) of Theorem 5 is replaced by the uniform convergence in $L^p$ norm of spherical means of each function in the class to the function itself. (Clearly, our Condition (ii) is automatic in this case.)

Just a year later, Tamarkin [28] expanded this result to the case of unbounded supports by adding Condition (ii) of Theorem 5.

In 1933, Tulajkov [31] expanded the Kolmogorov-Tamarkin result to the case $p = 1$.

In the same year, and probably independently, M. Riesz [55] proved the result for $1 \le p < \infty$, essentially in the form of our Theorem 5. Thus we feel somewhat justified in using the names Kolmogorov and Riesz in referring to the theorem, though we are perhaps being a bit unfair to Tamarkin and Tulajkov in doing so.

The compactness theorem has also seen generalizations in other directions. 

Hanson [15] proved a necessary and sufficient condition for compactness of a family of measurable functions on a bounded measurable set, with respect to convergence in measure. (Here the measurable functions form a metric space in which the distance between two functions is the infimum of all $\varepsilon > 0$ so that the two functions differ by at most $\varepsilon$ except on a set of measure $< \varepsilon$.)

Fréchet [13] replaced Conditions (i) and (ii) of Theorem 5 with a single condition ("equisummability"), and generalized the theorem to arbitrary positive $p$.

Phillips [23, Thm 3.7] proved a necessary and sufficient condition for compactness in $L^p$ on a general measure space ($1 < p < \infty$), and indeed in any Banach space, which is however somewhat less suited to applications to PDEs. Nevertheless, our sufficiency proof for Theorem 5 is based on Phillips' criterion. (It is more common, albeit more involved, to use mollifiers in the proof.)

Weil [33] (see also [HI p. 269 ff]) extended the result to $L^p(G)$ where $G$ is a locally compact group. Tsuji [30] considered the case of $L^p(\mathbb{R}^d)$ with $0 < p < 1$, and Takahashi [27] studied the same problem in Orlicz spaces. A characterization of compact subsets of $L^p([0,T]; B)$ ($B$ a Banach space), which is very convenient in the context of time-dependent partial differential equations, is given by Simon [26] (see also [20]). A readable account of some of the historical development can be found in [51 p. 388]. Helly's theorem [T^], which was published already in 1912, is easily seen to be a special case of Kolmogorov's compactness theorem in the one-dimensional case, see Section 6.

Further references include [32], [17], [5], [S], [11], [21].

5. The Rellich-Kondrachov theorem 

In this section we use Kolmogorov's theorem to prove a simple variant of the Rellich-Kondrachov theorem [2UHFJ. Our simplification consists in avoiding boundary regularity conditions by working on the entire space $\mathbb{R}^n$. The standard Rellich-Kondrachov theorem requires a bounded region. The present version replaces this by a uniform decay estimate, specially tailored to fit the framework of the present paper.

The Sobolev norm $\|f\|_{1,p}$ on $W^{1,p}(\mathbb{R}^n)$ is defined by

$$\|f\|_{1,p} = \Bigl(\int_{\mathbb{R}^n} \bigl(|f(x)|^p + |\nabla f(x)|_p^p\bigr) \, dx\Bigr)^{1/p}, \qquad |\nabla f|_p = \Bigl(\sum_{j=1}^{n} \Bigl|\frac{\partial f}{\partial x_j}\Bigr|^p\Bigr)^{1/p}.$$

According to the Sobolev embedding theorem, if $p < n$ then $W^{1,p}(\mathbb{R}^n) \subset L^q(\mathbb{R}^n)$, and the inclusion map is bounded, for any $q$ satisfying $p \le q \le p^*$, where $p^*$ is the conjugate Sobolev exponent:

$$\frac{1}{p^*} = \frac{1}{p} - \frac{1}{n}.$$

To see where this exponent comes from, consider a function $f$ and its scalings $f_\lambda(x) = f(x/\lambda)$ where $\lambda > 0$, and note that $\|f_\lambda\|_p = \lambda^{n/p}\|f\|_p$ and $\|\nabla f_\lambda\|_p = \lambda^{n/p - 1}\|\nabla f\|_p$, so the inclusion map $W^{1,p} \to L^q$ can only be bounded if there exists a constant $C$ with $\lambda^{n/q} \le C(\lambda^{n/p} + \lambda^{n/p - 1})$ for all $\lambda > 0$. In the limits $\lambda \to \infty$ and $\lambda \to 0$ we conclude $n/q \le n/p$ and $n/q \ge n/p - 1$ respectively.
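
The two scaling identities used in this argument follow from the substitution $x = \lambda z$. A short worked version is given below; the notation $f_\lambda(x) = f(x/\lambda)$ is that of the text, and the exponent bookkeeping in the closing comment is our own restatement.

```latex
\begin{align*}
\|f_\lambda\|_p^p
  &= \int_{\mathbb{R}^n} |f(x/\lambda)|^p \, dx
   = \lambda^{n} \int_{\mathbb{R}^n} |f(z)|^p \, dz
   = \lambda^{n} \|f\|_p^p , \\
\|\nabla f_\lambda\|_p^p
  &= \int_{\mathbb{R}^n} \lambda^{-p} \, |(\nabla f)(x/\lambda)|^p \, dx
   = \lambda^{n - p} \|\nabla f\|_p^p .
\end{align*}
% Taking p-th roots gives the exponents n/p and n/p - 1. The two limits then read
%   n/q \le n/p       <=>  q \ge p,
%   n/q \ge n/p - 1   <=>  1/q \ge 1/p - 1/n = 1/p^*  <=>  q \le p^*.
```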

Theorem 10. Assume $p < n$ and $p \le q < p^*$, and let $\mathcal{F}$ be a bounded subset of $W^{1,p}(\mathbb{R}^n)$. Assume that for every $\varepsilon > 0$ there exists some $R$ so that, for every $f \in \mathcal{F}$,

$$\int_{|x| > R} \bigl(|f(x)|^p + |\nabla f(x)|_p^p\bigr) \, dx < \varepsilon^p.$$

Then $\mathcal{F}$ is a totally bounded subset of $L^q(\mathbb{R}^n)$.

Proof. We shall show that $\mathcal{F}$ satisfies the hypotheses of Theorem 5 with $p$ replaced by $q$. We shall use the Sobolev embedding inequality $\|f\|_q \le C\|f\|_{1,p}$, where the constant $C$ depends only on $p$, $q$ and $n$, and which is valid under the stated assumption, see [J 4.30 (p. 101) and Theorem 4.12 I C (p. 85) with $j = 0$, $k = n$, $m = 1$]. Condition (i) of Theorem 5 follows immediately from the Sobolev embedding inequality. Condition (ii) is almost equally immediate, from applying the Sobolev embedding inequality to the function $x \mapsto f(x)\chi(|x| - R)$, where $\chi \in C^\infty(\mathbb{R})$, $0 \le \chi \le 1$, $\chi(x) = 0$ for $x \le 0$ and $\chi(x) = 1$ for $x \ge 1$.

If we apply the Sobolev embedding inequality to the function $x \mapsto f(x/\lambda)$ where $\lambda > 0$ and change variables in the resulting integrals, we obtain

$$\lambda^{n/q}\|f\|_q \le C\Bigl(\lambda^{n} \int_{\mathbb{R}^n} |f(x)|^p \, dx + \lambda^{n-p} \int_{\mathbb{R}^n} |\nabla f(x)|_p^p \, dx\Bigr)^{1/p}. \qquad (1)$$

We shall apply the above inequality not to $f$, but to $x \mapsto f(x + y) - f(x)$, where $f \in \mathcal{F}$.

Now let $\varepsilon > 0$ be given. By picking $\lambda$ sufficiently large we can ensure that

$$C\Bigl(\lambda^{n-p} \int_{\mathbb{R}^n} |\nabla f(x + y) - \nabla f(x)|_p^p \, dx\Bigr)^{1/p} < \varepsilon\lambda^{n/q} \qquad (2)$$

for all $f \in \mathcal{F}$, since the integral in this expression is bounded uniformly for $f \in \mathcal{F}$. Next, we find (using the Jensen and Hölder inequalities, then Fubini's theorem)

$$\begin{aligned}
\int_{\mathbb{R}^n} |f(x + y) - f(x)|^p \, dx &= \int_{\mathbb{R}^n} \Bigl|\int_0^1 y \cdot \nabla f(x + ty) \, dt\Bigr|^p dx \\
&\le |y|_{p'}^p \int_0^1 \int_{\mathbb{R}^n} |\nabla f(x + ty)|_p^p \, dx \, dt \\
&= |y|_{p'}^p \int_{\mathbb{R}^n} |\nabla f(x)|_p^p \, dx
\end{aligned}$$

(where $p$ and $p'$ are conjugate exponents) for any test function $f$, and hence for any $f \in W^{1,p}$. The integrals on the right-hand side of this inequality are uniformly bounded for $f \in \mathcal{F}$, and so we can find some $\delta > 0$ so that $|y| < \delta$ implies

$$C\Bigl(\lambda^{n} \int_{\mathbb{R}^n} |f(x + y) - f(x)|^p \, dx\Bigr)^{1/p} < \varepsilon\lambda^{n/q}. \qquad (3)$$

For such $y$ and $f$, (1) applied to $x \mapsto f(x + y) - f(x)$ combined with (2) and (3) yield

$$\lambda^{n/q}\|f(\cdot + y) - f(\cdot)\|_q < 2^{1/p}\varepsilon\lambda^{n/q},$$

and so assumption (iii) of Theorem 5 is satisfied. □

6. Helly's theorem 

Helly's theorem is often referred to as Helly's selection principle, in order to avoid confusion with another theorem by Helly, stating that, given a collection of convex sets in $\mathbb{R}^n$ so that any $n + 1$ of them have a point in common, then any finite sub-collection has nonempty intersection. Helly's selection principle is essentially a corollary of the Kolmogorov-Riesz theorem, though historically it was not derived that way.

Recall that an integrable function $f$ on the line is of bounded variation if it has finite essential or total variation, that is, if

$$\mathrm{TV}(f) = \sup \sum_{j=1}^{m} |f(x_{j+1}) - f(x_j)| < \infty,$$

where the supremum is taken over all finite partitions $x_j < x_{j+1}$ such that each $x_j$ is a point of approximate continuity of $f$ (that is, $\delta^{-1}|\{x : |x - x_j| < \delta, \ |f(x) - f(x_j)| > \varepsilon\}| \to 0$ for every $\varepsilon > 0$ as $\delta \to 0$; see, e.g., [10, p. 47]). We need a lemma:

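Lemma 11. Let $u$ be a function of bounded variation on $\mathbb{R}$. Then

$$\int_{-\infty}^{\infty} |u(x + y) - u(x)| \, dx \le |y| \, \mathrm{TV}(u)$$

for all $y \in \mathbb{R}$.

Proof. We may assume $y > 0$ without loss of generality. The calculation

$$\begin{aligned}
\int_{-\infty}^{\infty} |u(x + y) - u(x)| \, dx &= \sum_{j=-\infty}^{\infty} \int_0^y |u(x + jy + y) - u(x + jy)| \, dx \\
&= \int_0^y \sum_{j=-\infty}^{\infty} |u(x + (j + 1)y) - u(x + jy)| \, dx \\
&\le \int_0^y \mathrm{TV}(u) \, dx = y \, \mathrm{TV}(u)
\end{aligned}$$

proves the lemma. □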
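
The inequality under the integral sign in the proof of Lemma 11 uses only that the points $x + jy$ form an increasing sequence. A sketch of that step follows. The remark on approximate continuity is ours and is not spelled out in the text: for almost every $x$, all of the countably many points $x + jy$ are points of approximate continuity of $u$.

```latex
% For fixed x and y > 0, the points x + jy, j = -J, ..., J + 1, form a finite partition, so
\[
\sum_{j=-J}^{J} \bigl| u(x + (j+1)y) - u(x + jy) \bigr| \le \mathrm{TV}(u)
\quad\text{for every } J \ge 0 .
\]
% Letting J \to \infty (the partial sums are non-decreasing in J) gives
\[
\sum_{j=-\infty}^{\infty} \bigl| u(x + (j+1)y) - u(x + jy) \bigr| \le \mathrm{TV}(u),
\]
% and integrating over x in [0, y) yields the bound y \, \mathrm{TV}(u).
```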



Theorem 12 (Helly). Let $(u_n)$ be a sequence of functions of bounded variation on the bounded real interval $[a, b]$. If there is a constant $M$ so that $\mathrm{TV}(u_n) \le M$ and $\|u_n\|_\infty \le M$ for all $n$, then there is a subsequence of $(u_n)$ which converges pointwise everywhere and in $L^1$ norm in $[a, b]$ to a function of bounded variation.

Proof. Extend each function $u_n$ to all of $\mathbb{R}$ by setting it to zero outside $[a, b]$. By Lemma 11, the set of all the functions $u_n$ satisfies Condition (iii) of Theorem 5 (with $p = 1$), while (i) holds by assumption and (ii) is trivial. Hence there is a subsequence of $(u_n)$ which converges in $L^1([a, b])$. Moreover, integration theory tells us that we also get pointwise convergence almost everywhere, possibly after passing to a subsequence once more. However, this is not quite enough.

Write instead $u_n = v_n - w_n$ where each $v_n$, $w_n$ is a non-decreasing function: $v_n(x)$ is $u_n(a)$ plus the positive variation of $u_n$ on the interval $[a, x]$, and $w_n(x)$ is the negative variation on the same interval. Then the sequences $(v_n)$ and $(w_n)$ both satisfy the conditions of the present theorem, and so, by the result of the previous paragraph, we may pass to a subsequence so that $(v_n)$ and $(w_n)$ both converge in $L^1([a, b])$, as well as pointwise almost everywhere.

Let $v$ be the limit of the sequence $(v_n)$. Clearly, $v$ is non-decreasing on the set where pointwise convergence holds, and so we may assume that $v$ is non-decreasing everywhere, after possibly redefining it on a set of measure zero.

Now it is clear that $v_n(x) \to v(x)$ for any point of continuity $x$ for $v$: Given $\varepsilon > 0$, pick $\delta > 0$ so that $|y - x| < \delta$ implies $|v(y) - v(x)| < \varepsilon$, let $x - \delta < y < x < z < x + \delta$ with $v_n(y) \to v(y)$ and $v_n(z) \to v(z)$, and note that for $n$ large enough we get $v(x) - 2\varepsilon < v(y) - \varepsilon < v_n(y) \le v_n(x) \le v_n(z) < v(z) + \varepsilon < v(x) + 2\varepsilon$, so that $|v_n(x) - v(x)| < 2\varepsilon$.

Since $v$ has at most a countable number of discontinuities, a diagonal argument yields a further subsequence which converges at all the discontinuities of $v$ as well, and so we have pointwise convergence everywhere.

In the same way we show that $w_n(x) \to w(x)$ for all $x$. Thus $u_n \to v - w$ pointwise, and $v - w$ has bounded variation. □

Remark 13. The above proof is probably not the most natural one, but it does make clear the connection with the Kolmogorov-Riesz theorem. In a sense $L^1$ convergence is irrelevant: Pointwise convergence is the key, and $L^1$ convergence follows from the bounded convergence theorem.

It should be noted, however, that Helly's theorem, without pointwise convergence, is also true in higher dimensions [10, p. 176].

A recent generalization of Helly's selection principle (in one dimension) can be found in [25].